Abstract

Background: The robustness of speech perception in the face of acoustic variation is founded on the ability of the 
auditory system to integrate the acoustic features of speech and to segregate them from background noise. This 
auditory scene analysis process is facilitated by top-down mechanisms, such as recognition memory for speech 
content. However, the cortical processes underlying these facilitatory mechanisms remain unclear. The present 
magnetoencephalography (MEG) study examined how the activity of auditory cortical areas is modulated by 
acoustic degradation and intelligibility of connected speech. The experimental design allowed for the comparison 
of cortical activity patterns elicited by acoustically identical stimuli which were perceived as either intelligible or 
unintelligible. 

Results: In the experiment, a set of sentences was presented to the subject in distorted, undistorted, and again in 
distorted form. The intervening exposure to undistorted versions of sentences rendered the initially unintelligible, 
distorted sentences intelligible, as evidenced by an increase from 30% to 80% in the proportion of sentences 
reported as intelligible. These perceptual changes were reflected in the activity of the auditory cortex, with the 
auditory N1m response (~100 ms) being more prominent for the distorted stimuli than for the intact ones. In the
time range of auditory P2m response (>200 ms), auditory cortex as well as regions anterior and posterior to this 
area generated a stronger response to sentences which were intelligible than unintelligible. During the sustained 
field (>300 ms), stronger activity was elicited by degraded stimuli in auditory cortex and by intelligible sentences in 
areas posterior to auditory cortex. 

Conclusions: The current findings suggest that the auditory system comprises bottom-up and top-down processes 
which are reflected in transient and sustained brain activity. It appears that analysis of acoustic features occurs 
during the first 100 ms, and sensitivity to speech intelligibility emerges in auditory cortex and surrounding areas 
from 200 ms onwards. The two processes are intertwined, with the activity of auditory cortical areas being 
modulated by top-down processes related to memory traces of speech and supporting speech intelligibility. 

Keywords: Acoustic distortion, Auditory evoked magnetic fields, Auditory cortex, Human, Intelligibility, 
Magnetoencephalography, N1m, P2m, Speech processing, Sustained field 



Background 

Successful comprehension of connected speech can be 
seen as an auditory scene analysis problem [1] involving 
the matching of the acoustic properties of the incoming 
voice signal with memory representations of speech. 
This is not a linear bottom-up process, but one that can 
be modified in a top-down fashion by long-term memory
traces of, for example, one's native language mediating
syntactic and semantic content [2,3] as well as
information concerning affective aspects of the speaker 
[4]. Prior expectations of the stimuli can dramatically 
alter the perception of acoustically distorted speech, ren- 
dering initially unintelligible stimuli entirely comprehen- 
sible (see e.g. [5]). Despite increasing efforts in the study 
of the neural basis of speech perception, it has proven 
difficult to distinguish between the cortical processes 
related to the bottom-up extraction of acoustic features 
of speech and facilitating top-down mechanisms, such as 
recognition memory for linguistic content. One reason
for this is that in most studies the effects related to 
speech comprehension have been investigated by com- 
paring brain responses to intact speech with those eli- 
cited by acoustically degraded versions of the stimuli. 
Having two or more acoustically different types of stim- 
uli poses a major problem: Acoustic variability in itself 
leads to differences in the response and, consequently, 
the relative contributions of acoustic feature processing 
and speech comprehension become confounded and, 
thus, it is difficult to isolate the effects specific to speech 
comprehension. An experimental paradigm which con- 
trols for acoustic variability would appear to be needed 
in order to simultaneously capture both the bottom-up 
and top-down aspects of speech comprehension. 

In electro- and magnetoencephalography (EEG & MEG, 
respectively) recordings of the human brain, the presenta- 
tion of short-duration (<200 ms) auditory stimuli typically 
results in transient responses and, in the case of long- 
duration stimuli (>300 ms), the transient responses are 
followed by a sustained response. These responses are sui- 
ted for revealing both the spatial and temporal evolution 
of cortical activity. Notably, the transient auditory N1m
response generated in the auditory cortex [6] is sensitive 
to multiple aspects of speech such as fundamental fre- 
quency [7], formant transitions [8], intonation [9], place of 
articulation [10,11], the periodic structure of vowel sounds 
[12-15], and the phonetic features of consonants [16]. 
These observations indicate that the auditory cortex car- 
ries out parallel bottom-up processing of the acoustic 
properties of speech sounds, independently of the subject's
attentional focus. Further, the N1m has also been found to
be a sensitive measure of the extraction process with 
which the human brain segregates speech signals from 
various types of noise contributions [17-19]. While these 
EEG/MEG studies, utilizing short (~200 ms) isolated
vowel sounds in no-task (passive) recording conditions, 
have revealed the link between transient activation and 
the bottom-up extraction mode in acoustic feature proces- 
sing, the role of top-down influences on the processing of 
meaningful speech has remained largely unaddressed. 
Therefore, the above studies should be complemented by 
investigations focusing on the sustained activity elicited by 
connected speech, thus potentially revealing how activity 
spreads to multiple cortical brain areas performing speech 
processing. 

A growing body of hemodynamic evidence suggests that 
the cortical processes underlying speech comprehension 
operate in a hierarchical fashion, with the auditory cortex 
activated by the acoustic properties of sounds (see, e.g. 
[20]), and the regions anterior and posterior to auditory 
cortex being sensitive to the intelligibility of speech 
[21-28]. The anatomical areas underlying the speech comprehension
network have been investigated by contrasting
cortical responses to speech with responses to closely
matched non-speech stimuli, such as noise-vocoded 
sounds [21,23-26] or tonal stimuli [27,29]. These studies 
indicate that areas in the superior temporal sulcus (STS) 
respond more vigorously to speech than to non-speech 
stimuli, and that STS regions anterior and posterior to 
auditory cortex are sensitive to the intelligibility of the 
stimuli. In addition, when speech stimuli with different 
levels of intelligibility are presented, overlapping temporal 
cortical areas seem to be activated during passive listening 
[26], active listening [23,24], and active recognition tasks 
[21,25,28]. 

Currently, the evidence pertaining to the role of auditory 
cortex seems to be contradictory, with findings indicating 
that this brain region is either sensitive [30] or insensitive 
to speech intelligibility (e.g. [22,25]). Given that top-down 
information - such as prior expectations of the stimuli - 
can substantially alter the perception of degraded speech 
[5], it seems plausible that the extraction of acoustic cues 
from the distorted signal might be enhanced through feed- 
back connections from higher-order cortical areas to 
auditory cortex. As a result, these changes in cortical pro- 
cessing might be observed in the activity of the auditory 
cortex as well. Given that hemodynamic measures lack
temporal acuity, the millisecond resolution of EEG/MEG 
measures of transient and sustained brain activity (pre- 
sumably confounded in hemodynamic measurements) 
might provide complementary information to fMRI find- 
ings on this issue. 

The current MEG study assesses how the intelligibility 
of connected speech is reflected in the temporal evolu- 
tion of activity in auditory cortex and surrounding areas. 
The study capitalizes on the phenomenon that the intel- 
ligibility of speech signals can be manipulated without 
changing the acoustic structure of the stimulus. Accor- 
dingly, the experiment consisted of three consecutive 
sessions during which the same set of sentences was pre- 
sented in distorted, undistorted, and - once again - in 
distorted form. As a result, acoustically identical dis- 
torted stimuli were perceived as either unintelligible or 
intelligible, depending on whether the subject had previ- 
ously been exposed to an undistorted, intact version of 
the sentence. Any possible change in brain activity, then, 
cannot be attributed to changes in the acoustic structure 
of the stimuli but, rather, to the top-down processes 
related to speech comprehension. We hypothesized that 
the activity generated in auditory cortex and reflected in 
the N1m and P2m responses would be sensitive to the
acoustic variation in the speech signal [17-19] whereas 
the sustained activity following transient activity might 
be responsive to whether speech is intelligible, and less 
sensitive to the acoustic attributes of the stimuli. Given 
the novelty of the proposed experimental paradigm, our 
aim was to proceed with caution and to provide a 
tentative description of brain events related to speech
intelligibility uncontaminated by the effects caused by 
attentional engagement (arousal level, sustained atten- 
tion, etc.) as well as by planning and execution of motor 
responses. Therefore, the current study focuses on brain 
activity obtained in the passive recording condition. 

Methods 

Participants 

Ten healthy, right-handed volunteers (age range 19-28 
years; mean=22.0; SD=2.93) participated in the study 
with written informed consent. In a pre-measurement 
questionnaire, all the participants declared themselves to 
be native, right-handed Finnish speakers with normal 
hearing. The experiments were approved by the Ethical 
Committee of the Helsinki University Central Hospital. 

Stimulus material 

The stimuli were created from speech data spoken in 
Finnish by a professional logopaedist. The recordings 
were made in an anechoic chamber with a high-quality 
condenser microphone (Brüel & Kjær 4188). The speech
waveforms were sampled at a rate of 22.05 kHz with an 
amplitude resolution of 16 bits. To remove any low- 
frequency fluctuation picked up by the microphone, the 
signals were high-pass filtered with a 6th order Butter- 
worth filter (cut-off frequency at 60 Hz). 

The speech data recorded consisted of 84 Finnish sentences,
each comprising six to seven words (3-4.4 s in duration).
The sentences were
constructed from three parts, including seven starting 
words, three sentence stubs, and four ending words. The 
starting and ending word was always a noun whereas 
sentence stubs involved nouns, adjectives and verbs. In
order to prevent any transient, time-locked activity 
occurring in the averaged data during the sustained field 
time range (300-3000 ms), the sentence stubs were 
constructed so that the acoustic structure of all the sen- 
tence stubs deviated from each other. This procedure 
resulted in a set of syntactically correct sentences which 
included both semantically meaningful (e.g. "The news 
broke out that a street was built into the village") as well 
as meaningless (e.g. "The news looked as strange as a 
bottled street") exemplars. In order to arrive at a suffi- 
cient signal-to-noise ratio (SNR) in MEG recordings, 
each subject was presented with a total of 120 sentences, 
with 36 random repetitions from the original 84- 
sentence set. 

The quality of the generated sentences was degraded 
by decreasing the amplitude resolution of the temporal 
waveform using the uniform scalar quantization (USQ) 
technique [31]. In this procedure, the maximum abso- 
lute value of each sentence is first determined. By using 
this value, the sentence is scaled to cover the full
dynamics of the 16-bit amplitude scale, that is, each
signal sample is rounded off to its nearest integer num- 
ber and there are altogether 65536 such integers be- 
tween the smallest negative value (-32768) and the 
largest positive value (32767). A distorted version of 
each sentence is then computed by using 1-bit quan- 
tization, in which the number of quantization levels is 
radically reduced from 65536 to just two. All in all, this 
quantization procedure yielded two versions of each sen- 
tence which, in the following, will be referred to as the 
undistorted (16-bit) and distorted (1-bit) sentence. After 
the quantization, the intensity (in terms of square sum 
of time-domain signal values) of the undistorted and dis- 
torted sentence was equalized and, finally, the onsets 
and offsets of the stimuli were smoothed with a 5-ms 
Hann window. Examples of waveforms can be found in 
[17-19]. 
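The quantization steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the two 1-bit levels are taken to be the sign of each sample (with zero mapped to the positive level), and the 5-ms Hann smoothing is interpreted as half-Hann onset and offset ramps. The text does not specify these details.

```python
import math


def one_bit_usq(samples):
    """Return (undistorted, distorted) versions of one sentence.

    undistorted: samples scaled so the peak reaches 16-bit full scale,
                 then rounded to integers.
    distorted:   1-bit version (two levels only), rescaled so that its
                 square sum equals that of the undistorted version.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("cannot quantize an all-zero signal")
    scale = 32767 / peak
    undistorted = [float(round(s * scale)) for s in samples]

    # Assumption: the two quantization levels are given by the sample sign.
    distorted = [1.0 if s >= 0 else -1.0 for s in undistorted]

    # Equalize intensity (square sum of time-domain values).
    energy_u = sum(s * s for s in undistorted)
    energy_d = sum(s * s for s in distorted)
    gain = math.sqrt(energy_u / energy_d)
    return undistorted, [gain * s for s in distorted]


def apply_hann_ramps(samples, fs=22050, ramp_ms=5.0):
    """Smooth onset and offset with half-Hann ramps (assumed interpretation)."""
    out = list(samples)
    n = min(int(round(fs * ramp_ms / 1000.0)), len(out) // 2)
    for i in range(n):
        w = 0.5 * (1.0 - math.cos(math.pi * i / n))  # rises from 0 towards 1
        out[i] *= w
        out[-1 - i] *= w
    return out
```

After this procedure every sample of the distorted sentence has the same magnitude, so all information is carried by the zero crossings of the waveform.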

Experimental design 

The experiment was designed to study the effects of 
acoustic degradation and intelligibility of connected 
speech on cortical activity (Figure 1). During the experi- 
ment, the participants first listened to the distorted sen- 
tences (Session 1), then to the undistorted versions of 
these sentences (Session 2), and finally to the distorted 
sentences again (Session 3). The offset-to-onset inter- 
sentence interval was 4 seconds. The subjects were 
instructed to focus their gaze on a fixation cross while 
listening. The sentence was followed by a 1-sec break. 
Subsequently, a question screen inquiring whether the
sentence was intelligible or not was displayed, and the
subject responded with a button press (Yes/No forced-choice
task) within a 3-sec time window. After the active
recording condition, the sentences were presented in the
passive condition during which the subjects were under
instruction to watch a silent movie while ignoring the
auditory stimuli.

Figure 1 Experimental design. The experiment consisted of three
sessions in which the subjects were presented with acoustically
distorted sentences (Session 1), followed by undistorted sentences
(Session 2), after which the distorted sentences were presented
once again (Session 3). After each sentence, the subjects indicated
whether they had understood the sentence by pressing a Yes/No
response key.

Intelligibility of noise-distorted speech has been stud- 
ied widely in psychoacoustics and speech communica- 
tion technology, and both objective and subjective 
assessment measures of speech intelligibility have been 
adopted. Objective measures to predict speech intelligi- 
bility in the presence of noise involve, for example, the 
articulation index [32], the speech transmission index 
[33], and the speech intelligibility index [34]. Subjective 
measures make use of listeners and various types of 
speech material, either synthetic or natural, involving 
stimuli such as phones, words, sentences and sometimes 
even free conversation. In subjective evaluation, speech 
stimuli are typically played only once to listeners who 
then write down what they believe to have understood 
when listening to the corrupted utterances. An intelligi- 
bility score is then computed as a percentage of speech 
elements correctly reported. Subjective intelligibility 
scores used in speech perception studies include, for ex- 
ample, word error rates [35] as well as sentence and 
consonant identification rates [36]. In the current study, 
however, neither objective measures nor the subjective 
scoring methods mentioned above were used in the 
evaluation of speech intelligibility. Instead, subjective in- 
telligibility was evaluated by asking the subject to grade 
the speech sentence in a binary manner as either com- 
pletely intelligible or unintelligible. By choosing the 
former, the subject indicated that she/he had understood 
the sentence correctly. By choosing the latter, the subject 
indicated that she/he was unable to understand the sen- 
tence. Instead of counting quantitatively the percentage 
of words correctly recognized, this binary subjective in- 
telligibility score measures solely whether the subject 
considers that she/he was able to understand the mean- 
ing of the heard sentence. Therefore, it is a genuinely 
subjective means to evaluate intelligibility of corrupted 
speech and well suited for use in the present study to in- 
vestigate speech processing in human subjects. 
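As a rough illustration of how such a binary score can be summarized across subjects, the sketch below computes the mean percentage of sentences reported as intelligible and its standard error. The function name and its inputs are hypothetical and are not taken from the study's analysis code.

```python
import math


def intelligibility_stats(yes_counts, n_sentences):
    """Mean and SEM (in percent) of per-subject 'Yes' proportions.

    yes_counts:  number of sentences each subject reported as intelligible
    n_sentences: number of sentences presented to each subject
    """
    props = [100.0 * c / n_sentences for c in yes_counts]
    n = len(props)
    if n < 2:
        raise ValueError("need at least two subjects to estimate the SEM")
    mean = sum(props) / n
    variance = sum((p - mean) ** 2 for p in props) / (n - 1)  # sample variance
    sem = math.sqrt(variance / n)
    return mean, sem
```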

Data acquisition 

Auditory evoked fields were recorded using a 306-sensor 
whole-head MEG device (Vectorview 4-D, Neuromag 
Oy, Finland) in a magnetically shielded room. The re- 
corded signals obtained by the 204 gradiometers were 
sampled at a rate of 0.6 kHz and low-pass filtered online 
with a cut-off frequency of 172 Hz. Horizontal and verti- 
cal eye movements were monitored by two electrode 
pairs. The position of the participant's head with respect
to the MEG sensor array was determined with four
head-position indicator (HPI) coils before the beginning
of each measurement session. The HPI coils were loca- 
lized with respect to the nasion and the preauricular 
points using a 3-D digitizer. The head-based coordinate 
system was defined by the x-axis passing through the
preauricular points (positive to the right), the y-axis 
passing through the nasion (positive to the front), and 
the z-axis as the vector cross product of the x- and y- 
unit vectors. The participant was instructed to remain 
stationary during the experiment. The auditory stimuli 
were binaurally delivered to the subject's ears through
plastic tube earphones with an average intensity of 65 
dB SPL. 

Offline preprocessing of the MEG data 

In order to exclude gradiometer sensors with recording 
artifacts or a low SNR, the recorded data was first visu- 
ally inspected. To suppress magnetic noise, spatio- 
temporal signal-space separation (Maxwell filtering with 
temporal extension) was performed for the data using 
Elekta Neuromag MaxFilter. Two data sets with differ- 
ent filtering and averaging parameters were extracted 
from the data, one for examining the transient N1m
and P2m responses, and the other for studying the sus- 
tained field (SF). The data set for the transient response 
analysis was filtered with a 2-30 Hz band-pass filter, 
and the data set for the sustained field analysis with a 
30 Hz low-pass filter. 

For averaging the data, the epochs were time-locked to 
the beginning of the stimuli, and amplitude-corrected 
using a 100-ms pre-stimulus baseline. Epochs with ex- 
cessive magnetic field gradient amplitudes (over 2000 
fT/cm) or with large eye movements (electro-oculogram 
threshold = 150 µV) were excluded from the average. A
500-ms window was used in averaging the transient 
response data, and a 3000-ms window for the sustained 
field data. The data of one subject was discarded due to 
a weak SNR and eye-movement artifacts. Furthermore, 
the data of another subject was excluded from the ana- 
lyses of the sustained field due to an insufficient number 
of averaged epochs. 
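The averaging procedure can be sketched for a single channel as follows. This is a simplified illustration, not the software actually used: the function names are hypothetical, and rejection is assumed to be based on the absolute amplitude of the uncorrected epoch, whereas acquisition software may use a peak-to-peak criterion instead. At a 0.6 kHz sampling rate, the 100-ms baseline corresponds to 60 samples.

```python
def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline samples from the epoch."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [v - base for v in epoch]


def average_epochs(meg_epochs, eog_epochs, n_baseline,
                   meg_limit=2000.0, eog_limit=150.0):
    """Average the epochs that pass both rejection criteria.

    meg_epochs: gradiometer epochs in fT/cm, time-locked to stimulus onset
    eog_epochs: matching electro-oculogram epochs in microvolts
    Returns (average, number_of_accepted_epochs).
    """
    kept = []
    for meg, eog in zip(meg_epochs, eog_epochs):
        # Assumed criterion: absolute amplitude above the limit rejects the epoch.
        if max(abs(v) for v in meg) > meg_limit:
            continue
        if max(abs(v) for v in eog) > eog_limit:
            continue
        kept.append(baseline_correct(meg, n_baseline))
    if not kept:
        raise ValueError("all epochs were rejected")
    n = len(kept)
    average = [sum(column) / n for column in zip(*kept)]
    return average, n
```

Returning the number of accepted epochs makes it straightforward to flag data sets with too few averaged epochs.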

MEG data analysis 

Latencies, amplitudes and source localization of the 
transient responses 

The latencies and amplitudes of the N1m and P2m
responses were determined separately in each hemi- 
sphere from the peak values of the responses. The peak 
amplitude was calculated as the maximum magnitude of 
the vector sum from the sensor pair exhibiting the max- 
imum response within a 90-140 ms time window for the 
N1m, and a 150-250 ms window for the P2m response.
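A minimal sketch of this peak measure is given below, assuming that the vector sum of a gradiometer pair is the Euclidean norm of its two orthogonal gradient signals at each time point. The function names are hypothetical.

```python
import math


def vector_sum(grad_a, grad_b):
    """Magnitude of the two orthogonal gradients of one sensor pair."""
    return [math.hypot(a, b) for a, b in zip(grad_a, grad_b)]


def peak_in_window(times_ms, values, t_min, t_max):
    """Return (peak amplitude, peak latency) within [t_min, t_max] ms."""
    candidates = [(v, t) for t, v in zip(times_ms, values)
                  if t_min <= t <= t_max]
    if not candidates:
        raise ValueError("no samples inside the time window")
    amplitude, latency = max(candidates)  # ties resolve to the later sample
    return amplitude, latency
```

With these helpers, the N1m peak would be read out with `peak_in_window(times, vs, 90, 140)` and the P2m peak with `peak_in_window(times, vs, 150, 250)`, where `vs` is the vector sum of the selected sensor pair.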

The source locations of the N1m and P2m responses
were estimated by equivalent current dipole (ECD)
analyses [37] conducted separately in the left and right
hemisphere. A single ECD was fitted to the data at the
N1m and P2m peak latency using a subset of 44 gradiometer
sensors covering the temporal areas of each
hemisphere. A spherical model was used to model the 
conductivity of the head. The ECD analyses were carried 
out using the Elekta Neuromag Xfit Source Modeling 
Software. 

Amplitude of the sustained field 

The sensor pairs yielding the maximal transient res- 
ponses also exhibited prominent sustained fields. The 
amplitude of the sustained field (SF) was quantified as 
the mean amplitude of the vector sum over a predefined 
time interval. Two distinct phases were observed in the 
AEF waveform: a large deflection within the 300-1000 
ms time window, and a relatively static part within the 
1000-3000 ms time interval. Thus, in each hemisphere, 
the amplitude of the sustained field was calculated sep- 
arately over the 300-1000 ms (early SF), and 1000-3000 
ms (late SF) time intervals. 
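The sustained-field measure amounts to a windowed mean of the vector-sum waveform. The sketch below is illustrative only: the function name is hypothetical, and the half-open interval (so that the 1000-ms sample counts only towards the late SF) is an assumption not stated in the text.

```python
def mean_amplitude(times_ms, values, t_min, t_max):
    """Mean of values whose time stamps fall in [t_min, t_max) ms."""
    selected = [v for t, v in zip(times_ms, values) if t_min <= t < t_max]
    if not selected:
        raise ValueError("no samples inside the time window")
    return sum(selected) / len(selected)


def sustained_field(times_ms, vector_sum_values):
    """Return (early SF, late SF) mean amplitudes for one hemisphere."""
    early = mean_amplitude(times_ms, vector_sum_values, 300, 1000)
    late = mean_amplitude(times_ms, vector_sum_values, 1000, 3000)
    return early, late
```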

Current distribution estimates 

To study the spatial distribution of cortical activity, 
noise-normalized minimum-norm estimates (MNEs) 
were calculated. To this end, noise-covariance matrices 
were computed from the 100-ms pre-stimulus baselines 
of the individual epochs in the Maxwell-filtered raw 
data. The forward solutions and the inverse operators 
were calculated for each session by employing a 
boundary-element model computed using average head 
and skull surface reconstructions provided with the 
MNE-Suite Software. 

MNE and sLORETA [38] estimates were computed at 
5-ms intervals for the data of each individual subject. To 
study the current distribution during the transient 
responses, both estimates were averaged over 40-ms 
time windows centered at the peaks of the N1m and
P2m responses. The peak latencies of the responses were
obtained from the gradiometer analyses (see above).
Fixed time windows of 300-1000 ms (early SF) and 
1000-3000 ms (late SF) were used in analyzing the 
current distribution during the sustained field responses. 

Region of interest analysis 

The cortical surface used in calculating the MNE esti- 
mates was divided into 24 regions of interest (ROIs), 12 
regions in each hemisphere. The ROIs used in the ana- 
lyses are depicted in Figure 2. The ROIs were labeled 
according to their physical location: anterior/central/ 
posterior, superior/inferior, and temporal/parietal region. 
The ROIs were selected from the parietal and temporal 
cortices with emphasis on examining the spread of acti- 
vation from the primary auditory cortical areas within 
the superior temporal gyrus to regions anterior and pos- 
terior to the auditory cortex, and to the parietal regions 
[39,40]. The ROIs were centered on the auditory cortical 
areas within the superior temporal gyrus (CST in the 
current notation, see Figure 2). We emphasize that due 
to a lack of individual structural MRI data, the ROIs 
should be regarded as approximations of particular cortical
areas only. For example, Wernicke's and Broca's
areas lie roughly within PST and AST, respectively, and
motor areas within the pre- and postcentral gyrus can 
be found in AIP and CIP, respectively [41]. The mean 
current for each ROI was calculated separately for the 
transient and sustained responses as the average of the 
MNEs over all the voxels within each ROI (see previous 
subsection; see also [42]). The MNE values were extracted 
from the original MNEs without noise normalization. 

Statistical analyses 

Repeated-measures analysis of variance (ANOVA) was 
used to analyze 1) the amplitudes, latencies, and source 
location of the transient responses, 2) the mean ampli- 
tudes of the sustained responses, and 3) the mean currents 
within each ROI. All the ANOVAs were of the design
2 × 2 × 3, with the within-subjects factors of hemisphere
(left/right), recording condition (active/passive), and degradation
(distorted first presentation/undistorted/distorted
second presentation). The effects of intelligibility (based on
the current behavioral results) on the cortical activity measures
were analyzed by post-hoc comparisons (Newman-Keuls
test) of the degradation factor levels.

Figure 2 Regions of interest (ROIs) used for calculating the mean currents in the MNE analyses. The grid provides a rough parcellation of
brain areas in the anterior/central/posterior, inferior/superior, and parietal/temporal dimensions.

Results 

Intelligibility of the sentences 

During the MEG measurements, the subjects first lis- 
tened to sentences acoustically degraded through ampli- 
tude quantization. These were then followed by the 
same set of sentences in undistorted sound quality. 
Finally, the subjects heard the degraded sentences again 
(see Figure 1). Subjective intelligibility, that is, the proportion
of sentences the subject reported having understood,
was 94.8% (±0.9%) for the undistorted sentences,
and it increased substantially between the first and the
second presentation of the degraded sentences for all of
the subjects (Figure 3). The mean proportion of sentences
reported as intelligible was 48.7 percentage points
lower for the first presentation (30.2%; SEM: ±7.6%) than
for the second presentation of the degraded sentences
(78.9%; SEM: ±3.7%; F(1,7)=43.26, p<0.0005). This increase
in subjective intelligibility from 30% to 80% for
acoustically identical stimuli demonstrates how a single
presentation of intact speech material can drastically
alter the subject's ability to comprehend degraded connected
speech.

Activity in auditory cortex 

In the first stage of the analyses, the local activation of
the auditory cortex was studied separately in the left and
right hemisphere by using the pair of gradiometer sensors
in each hemisphere exhibiting the largest responses.
Prominent N1m and P2m responses were elicited at the
beginning of all sentences, as depicted in Figure 4, which
shows the first 300 ms of the AEF. The AEF over the
longer, 0-3000 ms period is shown in Figure 5.

Figure 3 Behavioral results. In the case of the first presentation of
the distorted sentences, the stimuli were very difficult to understand
(mean subjective intelligibility rating = 30.2%). After an intervening
presentation of the same sentences in undistorted form, the
comprehensibility of the distorted sentences increased considerably
(78.9%). Error bars indicate the standard error of the mean (SEM).

Amplitudes of the transient responses 

The mean amplitudes of the N1m and P2m responses
were larger for the distorted sentences than for the undistorted
sentences. The mean amplitude of the N1m
increased from 42.8 fT/cm to 63.6 fT/cm as a result of
sound degradation (Figure 4, F(2,16)=9.63, p<0.002). The
P2m response was also larger for the distorted (33.9 fT/cm)
than for the undistorted (22.1 fT/cm) sentences
(Figure 4, F(2,16)=7.51, p<0.01). The effect of distortion
on the P2m amplitude was more pronounced in the
right hemisphere than in the left (F(2,16)=8.05, p<0.005).
Post-hoc tests revealed that this hemispheric asymmetry
was due to a smaller P2m response for the undistorted
sentences in the right hemisphere (19.1 fT/cm) than in
the left (25.2 fT/cm; p<0.001). Furthermore, the N1m
and P2m amplitudes were equally large for the first and
second presentation of the degraded sentences, indicating
that the increase in perceptual intelligibility (see
Figure 3) was not reflected in the amplitudes of the transient
responses.

Latencies of the transient responses 

In comparison to responses elicited by the undistorted
sentences, degradation of sound quality resulted in earlier
N1m and P2m responses (Figure 4). The latency of the
N1m decreased from 125 ms to 114 ms (F(2,16)=19.07,
p<0.0001), and that of the P2m from 202 ms to 172 ms
(F(2,16)=17.91, p<0.0001). In addition, the mean latencies
of the N1m were longer in the left hemisphere (120 ms)
than in the right (116 ms; F(2,16)=11.47, p<0.01). However,
the effect of distortion on the P2m latency was larger
in the right hemisphere than in the left (F(2,16)=7.04,
p<0.005), with the degraded stimuli eliciting the P2m response
15 ms earlier in the right (163 ms) than in the left
(178 ms) hemisphere (p<0.05 in all comparisons). Moreover,
given that both the first and the second presentation
of the distorted stimuli resulted in unvarying N1m and
P2m latencies, it appears that the intelligibility of the
degraded sentences does not affect the timing of transient
brain activity.

Source locations of the transient responses 

Changes in the source locations of the transient responses
were investigated by fitting a single equivalent current dipole
(ECD) at the response's peak latency in each hemisphere.
The source locations of the N1m and P2m
responses were modified by sound degradation and
attention. In the left hemisphere, the ECDs for the N1m response
were located 4.6 mm more medial for the distorted stimuli
(x=-49.5 mm) compared to those for the undistorted
stimuli (x=-54.0 mm; F(2,16)=4.57, p<0.05). In addition, the
N1m sources were more superior during active listening
than in the passive conditions (F(1,8)=6.54, p<0.05). However,
the effect of attention on the N1m source location
depended on sound degradation (F(2,16)=4.23, p<0.05).
Post-hoc tests showed that the N1m ECDs were more superior
in the active condition only during the first presentation
of the distorted sentences (p<0.02).

Figure 4 Grand-averaged transient evoked fields measured from the left and right hemisphere and the amplitudes and latencies of
the N1m and P2m responses. The distorted sentences elicited N1m responses with a larger amplitude and an earlier peak latency than their
undistorted counterparts. The N1m responses peaked later in the left hemisphere than in the right. The P2m responses were more prominent
and occurred earlier for the distorted sentences than for their undistorted counterparts. Furthermore, the P2m latency and amplitude effects
were more pronounced in the right hemisphere. The AEFs are from the sensor exhibiting maximum response, and the amplitude and latency
results have been calculated from the vector sum from the sensor pair with the maximum response. Error bars indicate the standard error of
the mean (SEM).

Sound degradation resulted in an 8.6-mm shift of the
P2m sources along the anterior-posterior axis, the sources
of the P2m being more posterior for the distorted stimuli
(y=4.9 mm) than for the undistorted stimuli (y=13.5 mm;
F(2,16)=18.89, p<0.0001). Furthermore, the P2m ECDs
were more medial in the right hemisphere during active
listening (x=44.0 mm) than in the passive conditions
(x=51.1 mm; F(1,8)=9.08, p<0.02). The effect of attention
on the lateral-medial position of the P2m sources was
dependent on sound degradation (F(2,16)=4.19, p<0.05).
More specifically, the ECDs for the active and passive conditions
diverged only when the stimuli were undistorted
(p<0.05).

Sustained responses 

The early (300-1000 ms) and the late (1000-3000 ms) parts
of the sustained field (SF) were analyzed separately
(Figure 5). The early SF had a higher mean amplitude than
its late counterpart (20.4-47.8 fT/cm and 13.3-26.4 fT/cm
for the early and late SF, respectively; F(1,7)=19.20,
p<0.005). The early SF was stronger in the active conditions
(34.6 fT/cm) than in the passive conditions (27.7 fT/cm;
F(1,7)=7.81, p<0.05). In addition, the effect of sound
degradation on the early SF amplitude approached statistical
significance (F(1,7)=3.67, p<0.052), the distorted sentences
yielding larger responses (33.4 fT/cm) than the
undistorted ones (25.8 fT/cm). Post-hoc comparisons
revealed that the early SF amplitude elicited by the first
presentation of the degraded sentences was significantly
Figure 5 Grand-averaged early and late phase of the sustained field (SF) measured from the left and right hemisphere. As shown in the
insets, the early SF (300-1000 ms) had a higher mean amplitude than the late SF (1000-3000 ms). Further, the early SF was larger in the active 
conditions than in the passive ones, and more prominent in the left hemisphere. The AEFs are from the sensor exhibiting maximum response, 
and the amplitude results have been calculated from the vector sum from the sensor pair with the maximum response. Error bars indicate the 
standard error of the mean (SEM). 



increased in comparison to the amplitude for the undistorted
stimuli (p<0.05). Overall, the early SF was more
pronounced in the left hemisphere, with this asymmetry
approaching statistical significance (F(1,7)=5.05, p<0.059).
All other statistical comparisons yielded non-significant
results.
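
The amplitude measure used for the sustained field can be illustrated with a minimal sketch. It assumes that the "vector sum" of a sensor pair means the Euclidean norm of the two orthogonal planar gradiometer signals, taken sample by sample, and that the early and late SF amplitudes are plain means over the 300-1000 ms and 1000-3000 ms windows. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def vector_sum(gx, gy):
    """Combine two orthogonal gradiometer signals sample by sample (assumed norm)."""
    return [math.hypot(a, b) for a, b in zip(gx, gy)]


def window_mean(times_ms, values, start_ms, end_ms):
    """Mean of `values` over samples with start_ms <= t < end_ms."""
    selected = [v for t, v in zip(times_ms, values) if start_ms <= t < end_ms]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return sum(selected) / len(selected)


# Toy data (hypothetical, in fT/cm): one sample every 100 ms from 0 to 2900 ms.
times_ms = list(range(0, 3000, 100))
gx = [30.0 if 300 <= t < 1000 else 12.0 for t in times_ms]
gy = [40.0 if 300 <= t < 1000 else 16.0 for t in times_ms]

amplitude = vector_sum(gx, gy)
early_sf = window_mean(times_ms, amplitude, 300, 1000)   # early SF window
late_sf = window_mean(times_ms, amplitude, 1000, 3000)   # late SF window
```

With real data, `gx` and `gy` would be the averaged signals of the gradiometer pair showing the maximum response.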

Activity in auditory cortex and surrounding areas 

In the second stage of the analyses, we elucidated how ac- 
tivity in auditory cortex and the surrounding areas evolves 
over time during the processing of connected speech. To 
this end, brain activity was analyzed using minimum norm 
estimates (MNEs) during the time ranges of transient 
(Figure 6) and sustained (Figure 7) brain activity. The 
current distribution was studied by dividing the temporal 
and parietal cortical surface into 24 regions of interest 
(ROIs) and calculating the mean currents over the voxels 
inside these regions (see Figure 2). The statistical results on 



the intensity of cortical activity within specific ROIs are 
given in Table 1. A graphical summary of the effects of 
acoustic degradation and intelligibility of speech on the 
strength of cortical activity within specific ROIs is shown 
in Figure 8. 
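
The ROI averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: each MNE voxel is assumed to carry a single current value and at most one ROI label, and the names `roi_mean_currents`, `voxel_currents` and `voxel_roi` are hypothetical. The numbers are toy values, not data from the study.

```python
from collections import defaultdict


def roi_mean_currents(voxel_currents, voxel_roi):
    """Return the mean current per ROI, ignoring voxels that lie outside all ROIs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for current, roi in zip(voxel_currents, voxel_roi):
        if roi is None:  # voxel not assigned to any region of interest
            continue
        sums[roi] += current
        counts[roi] += 1
    return {roi: sums[roi] / counts[roi] for roi in sums}


# Toy example (hypothetical values in pA/m): five voxels, two ROIs.
voxel_currents = [14.0, 16.0, 9.0, 11.0, 5.0]
voxel_roi = ["CST", "CST", "CIP", "CIP", None]
means = roi_mean_currents(voxel_currents, voxel_roi)
```

In an analysis of this kind, the function would be applied once per subject, hemisphere, condition and time window, and the resulting ROI means would then enter the ANOVA.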

N1m time range

The strongest source currents during the N1m time range
were localized in the vicinity of the auditory cortical areas
(Figure 6). Cortical activity was more pronounced in the
right hemisphere (11-12 pA/m) than in the left (7-9 pA/m)
within the anterior inferior parietal (AIP), anterior superior
temporal (AST) and posterior superior parietal (PSP)
regions. Furthermore, the mean current strength was also
higher in the right hemisphere than in the left within the
central superior parietal (CSP) area, although this
hemispheric asymmetry was observed only during passive
listening to the sentences. The mean currents during the
Figure 6 Noise-normalized current distribution (MNE) in the left and right hemisphere during passive listening within the N1m and
P2m time windows. Both responses originated in the vicinity of auditory cortex, with the N1m being more focal than the P2m. Normalization of
the MNE is with respect to the maximum value in the sLORETA estimate. Only the MNE voxels with a value over 50% of the maximum value in 
the sLORETA estimate are shown, and the same scaling is applied in all the estimates. 



N1m were affected by stimulus degradation in the central
inferior parietal (CIP) and central superior temporal (CST)
regions. In the case of CIP, post-hoc tests indicated that
the mean current was higher during the second presentation
of the degraded sentences (13 pA/m) than during the
undistorted sentences (10 pA/m; p<0.05). Within the CST,
both the first and the second presentation of degraded
speech yielded stronger activation (14 pA/m and 15 pA/m)
than undistorted speech (12 pA/m; p<0.02). Altogether, the
central parts of the superior temporal and inferior parietal
regions were responsive to acoustic degradation of speech
during the N1m time range.
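
The logic of the distorted-undistorted-distorted design can be made concrete with two simple contrasts. This is an illustrative sketch only and not the statistical procedure of the study, which relied on ANOVAs and post-hoc tests. It assumes that a degradation effect shows up as a difference between the mean of the two distorted presentations and the undistorted presentation, and that an intelligibility effect shows up as a difference between the second and the first presentation of the acoustically identical distorted sentences. The function name is hypothetical; the input values are the CST means reported above for the N1m time range.

```python
def condition_contrasts(distorted_first, undistorted, distorted_second):
    """Return (degradation contrast, intelligibility contrast) in the input units."""
    # Distorted vs. undistorted speech: sensitivity to acoustic structure.
    degradation = (distorted_first + distorted_second) / 2 - undistorted
    # Same distorted stimuli, before vs. after exposure to the intact versions.
    intelligibility = distorted_second - distorted_first
    return degradation, intelligibility


# CST mean currents (pA/m) in the N1m time range, as reported in the text.
cst_n1m = condition_contrasts(distorted_first=14.0, undistorted=12.0, distorted_second=15.0)
```

For these values the degradation contrast (2.5 pA/m) exceeds the intelligibility contrast (1.0 pA/m), in line with the description of the CST as responsive to acoustic degradation during the N1m.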

P2m time range 

The current distribution was more widespread during
the P2m than during the N1m (Figure 6). The mean
currents were stronger in the right-hemispheric AST,
PSP and central inferior temporal (CIT) regions (10-12
pA/m) than in their left-hemispheric counterparts (7-9
pA/m). Moreover, the average current strength was
higher during passive (11 pA/m) than active (9 pA/m)
listening within the PSP. A number of regions exhibited
sensitivity to the intelligibility of speech in the passive
listening condition: In the AST and CST, the cortical
activity was weaker during the (unintelligible) first
presentation of the degraded sentences (7-9 pA/m) than
during the undistorted stimuli (10-12 pA/m) and the second
presentation of the distorted stimuli (11-13 pA/m;
post-hoc p<0.05, except CST unintelligible vs. undistorted:
p<0.058). A comparable intelligibility effect was observed
also in the anterior inferior temporal (AIT) and posterior
inferior temporal (PIT) areas, with the unintelligible
Figure 7 Noise-normalized current distribution (MNE) in the left and right hemisphere during passive listening within the early and
late SF time windows. Activity in cortex was widespread during the generation of the early and late SF. Normalization of the MNE is with 
respect to the maximum value in the sLORETA estimate. Only the MNE voxels with a value over 50% of the maximum value in the sLORETA 
estimate are shown, and the same scaling is applied in all the estimates. 
Table 1 Statistical analyses of the MNE data

Effect / ROI                        F       df1   df2   p

N1m time range
  Hemisphere
    Anterior inferior parietal      19.81   1     7     <0.01
    Anterior superior temporal      9.83    1     7     <0.02
    Posterior superior parietal     7.78    1     7     <0.05
  Degradation
    Central inferior parietal       4.17    2     14    <0.05
    Central superior temporal       5.62    2     14    <0.02
  Hemisphere x Attention
    Central superior parietal       7.05    1     7     <0.05

P2m time range
  Hemisphere
    Anterior superior temporal      7.78    1     7     <0.05
    Central inferior temporal       6.44    1     7     <0.05
    Posterior superior parietal     9.37    1     7     <0.02
  Attention
    Posterior superior parietal     5.76    1     7     <0.05
  Attention x Degradation
    Anterior superior temporal      6.01    2     14    <0.02
    Central superior temporal       5.48    2     14    <0.02
    Anterior inferior temporal      4.77    2     14    <0.05
    Posterior inferior temporal     4.61    2     14    <0.05

Early SF time range
  Degradation
    Central superior temporal       5.28    2     14    <0.02
    Central inferior temporal       3.89    2     14    <0.05
  Attention x Degradation
    Central inferior parietal       4.44    2     14    <0.05
    Posterior superior temporal     4.20    2     14    <0.05

Late SF time range
  Hemisphere
    Central superior parietal       10.05   1     7     <0.02
    Posterior superior parietal     11.33   1     7     <0.02
  Attention
    Central superior parietal       6.84    1     7     <0.05
  Attention x Degradation
    Central inferior parietal       4.10    2     14    <0.05

ANOVA results for the effects of hemisphere, degradation and attention on the
source current strength within specific ROIs during the N1m, P2m, early SF and
late SF time windows. Only statistically significant effects are shown.
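
The relation between the F values, degrees of freedom and p thresholds in Table 1 can be checked with a short sketch. It uses the standard identity p = I_x(df2/2, df1/2), with x = df2 / (df2 + df1 F), where I_x is the regularized incomplete beta function. The integral is evaluated here with Simpson's rule, which is a choice made for this illustration and not something taken from the study; the function name `f_survival` is hypothetical.

```python
import math


def f_survival(f_value, df1, df2, steps=2000):
    """Upper-tail probability P(F >= f_value) for an F(df1, df2) distribution.

    Numerical sketch: valid for f_value > 0 and df2 >= 2; `steps` must be even.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    h = x / steps

    def density(t):
        # Unnormalized beta density t^(a-1) * (1-t)^(b-1).
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Composite Simpson's rule on [0, x].
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    integral = total * h / 3.0
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return integral / beta


# Three rows of Table 1: (F, df1, df2, reported upper bound on p).
rows = [(19.81, 1, 7, 0.01), (9.83, 1, 7, 0.02), (4.17, 2, 14, 0.05)]
consistent = [f_survival(f, d1, d2) < bound for f, d1, d2, bound in rows]
```

For df1 = 2 the result can be compared against the closed form (1 + 2F/df2)^(-df2/2), which gives a convenient sanity check on the numerical integration.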



stimuli yielding weaker currents than the intelligible 
ones. In the PIT, however, only the second presentation 
of degraded speech (9 pA/m) yielded significantly stron- 
ger currents than the first presentation of the same 
stimuli (6 pA/m; p<0.05). In turn, in the AIT, only the 
undistorted sentences (7 pA/m) resulted in significantly 
stronger activation than the first presentation of 
degraded stimuli (5 pA/m; p<0.05). Taken together, these 



findings suggest that a number of regions within the tem- 
poral cortex could be sensitive to the intelligibility of 
speech during the P2m time window, given that increased 
currents seem to be associated with the presentation of in- 
telligible sentences. 

Early SF time range 

Cortical activation during the sustained field (SF) was 
distributed over a large area within the temporal and 
parietal regions (Figure 7). As demonstrated in Figure 8, 
the CST and CIT regions were sensitive to degradation 
of speech during the early SF, with the distorted sen- 
tences resulting in stronger activity (10-11 pA/m) than 
the undistorted ones (7-8 pA/m). Furthermore, the effect 
of acoustic degradation on the early SF was dependent 
on attention within the CIP and PST areas. The PST, in 
particular, was sensitive to speech intelligibility in the 
passive condition as the unintelligible first presentation 
of degraded speech (7 pA/m) yielded a weaker mean 
current than in the two other conditions (9-10 pA/m; 
p<0.05). A similar effect was also observed in the CIP,
although it failed to reach statistical significance in the
post-hoc comparisons. Thus, it seems that PST and, pos- 
sibly, CIP are sensitive to speech intelligibility during the 
early part of the SF. 

Late SF time range 

As shown in Figure 7, activation within the CSP and 
PSP was stronger in the right hemisphere (10-12 pA/m) 
than in the left (8-10 pA/m) during the late SF. In 
addition, mean currents were higher in the CSP during
passive (11 pA/m) than active (8 pA/m) listening to
speech. The current strength during the late SF varied
also with stimulus degradation in the CIP, although this
effect depended on the listener's attentional state. More
specifically, in the passive condition, the first presenta- 
tion of the degraded sentences resulted in weaker cur- 
rents (7 pA/m) than during the second presentation of 
the degraded stimuli (11 pA/m; p<0.06). The mean 
current during the undistorted sentences (9 pA/m) was 
also stronger than the current elicited by the first pres- 
entation of the degraded sentences, although the differ- 
ence was not statistically significant. 

Discussion 

In the current study, we set out to investigate how the 
intelligibility of connected speech is reflected in behavioral 
measures as well as in the concomitant activity in the 
auditory cortex and surrounding brain areas. By varying 
the intelligibility of the stimuli while keeping the acoustic 
features of the stimuli constant, the experimental design 
allowed us to tentatively identify cortical processing 
related to speech comprehension. Initially unintelligible, 
acoustically distorted sentences resulting in a 30% 
Figure 8 Summary of the MNE results. Left: Regions of interest (ROIs) in the left and right hemisphere modulated by the acoustic quality of
speech, the intelligibility of speech, and the attentive state of the listener during the N1m, P2m, Early SF and Late SF time ranges. Right: MNE
source currents within the affected ROIs during each time range. During the N1m time range, sensitivity to acoustic structure was evident in the
central superior temporal (CST) and central inferior parietal (CIP) areas where stimulus degradation increased cortical activation. During the P2m 
time range, areas surrounding the CST in the superior and inferior temporal areas were more active during intelligible speech, regardless of 
whether the speech was acoustically distorted or not. In the early SF time range, the central temporal areas (CST & CIT) exhibited stronger activity 
to acoustically degraded speech while the posterior superior temporal (PST) area was sensitive to the intelligibility of speech. During the late SF, 
an attention-related effect was observable in the parietal brain areas (CSP). Error bars indicate the standard error of the mean (SEM). 



subjective intelligibility rating, were perceptually changed 
by presenting intact, undistorted versions of the sentences. 
Upon a second presentation of the acoustically distorted 
versions of the sentences, their intelligibility increased 
markedly, up to 80%. These perceptual changes were 
reflected in the transient and sustained activation of audi- 
tory cortex and surrounding brain areas. 

In the gradiometer analyses, local activity of the auditory 
cortex at 100 ms as indexed by the N1m response was
sensitive to the acoustic structure of speech in that the 
distorted stimuli elicited stronger activation with an earlier 
peak latency than the undistorted stimuli. An increase of 
response amplitude and decrease of latency was observed 
also at around 200 ms, in the P2m response. The ampli- 
tude and latency effects of the P2m were substantially 
more pronounced in the right hemisphere than in the left. 
These findings indicate that transient activity of the 



auditory cortex is sensitive to the acoustic properties of 
sound during the early (up to 300 ms) processing stages of 
connected speech, and that the right hemisphere is more 
sensitive to acoustic variability than the left. The initial 
transient responses were followed by a sustained response, 
arising at around 300 ms, and appearing to consist of an 
early (300-1000 ms) and a late (1000-3000 ms) phase. The 
early phase was more prominent in the left hemisphere 
and increased in amplitude when subjects attended to the 
stimuli. Compared to the preceding transient activity, the 
sustained activation was less sensitive to acoustic distor- 
tion of speech. 

In the MNE analyses, the auditory cortex and sur- 
rounding areas exhibited divergent, bilateral activity pat- 
terns associated with acoustic feature processing and 
speech intelligibility. During the N1m time range, an increase
in cortical activity due to stimulus degradation
was observed in regions extending from the superior
temporal gyrus (auditory cortex; CST in the current no- 
tation) to the inferior parts of the postcentral gyrus 
(CIP). Interestingly, a number of areas within the tem- 
poral cortex were sensitive to speech intelligibility 
during the P2m time range, with the intelligible stimuli - 
both distorted and undistorted - resulting in stronger 
activity than the unintelligible stimuli. This activation 
encompassed the auditory cortex, the inferior frontal 
gyri (including Broca's area; AST), the anterior part
of the superior temporal gyrus (AIT), and the poster- 
ior part of the inferior temporal gyrus (PIT). During 
the early phase of the SF (300-1000 ms), the auditory 
cortex was more active in response to the distorted 
than the undistorted sentences, regardless of their in- 
telligibility. In contrast, cortical activity in the poster- 
ior parts of the superior temporal gyrus (including 
Wernicke's area; PST) was stronger only during intel- 
ligible speech, regardless of whether the stimulus ma- 
terial was acoustically intact or distorted. 

In the present experiment, the stimuli were distorted 
by using amplitude quantization, which has been shown 
to decrease substantially the intelligibility of isolated 
speech sounds (see, e.g. [18,43]). This was also the case 
in the current study, as the distorted sentences were ini- 
tially very difficult to understand. However, after the 
subject was exposed to the undistorted versions of the 
sentences, the comprehensibility of the distorted sen- 
tences increased considerably. It is unlikely that the in- 
telligibility effect seen in both behavioral and brain 
measures is an effect due solely to the repetition of the 
distorted stimuli given that the gap between repetition 
(i.e., between Session 1 and 3) was around 20 minutes. 
This time span makes it improbable that the subject 
could have been drawing on any echoic or short-term 
memory resources. Instead, this increase in comprehen- 
sion was most likely caused by top-down mechanisms 
utilizing the long-term memory representations which 
were instantly activated (or primed; e.g. [44]) during lis- 
tening to the intact versions of the stimuli. Similar 
changes in the perception of acoustically identical 
speech-like stimuli have been observed also using noise- 
vocoded sentences [5,22] and sine-wave speech stimuli 
[45]. However, in these cases the perceptual changes 
were brought about through extended training sessions, 
whereas in the current context, these effects were immediate
and observable already after a single presentation
of the undistorted versions of the stimuli. Thus, depend- 
ing on the experimental setup, it now appears to be pos- 
sible to study brain mechanisms of perceptual learning 
occurring over a long time scale as well as rapid activa- 
tion of linguistic memory representations. 

The changes in the acoustic structure of the speech 
stimuli brought about by distortion were reflected in 



both the transient and sustained activation patterns of 
the auditory areas. In contrast, the temporal regions 
anterior and posterior to auditory cortex (area CST) 
were insensitive to degradation. The observed increase 
in the amplitude of the transient responses is in line 
with earlier results employing the same distortion 
method [17-19]. These studies have demonstrated that 
the amplitude increase of the N1m and P2m responses
is related to an increase in harmonic frequencies in the 
signal spectrum brought about by quantization. Accord- 
ing to this explanation, the additional harmonics activate 
a larger number of neurons involved in the pitch extrac- 
tion process. In the current study the latency of the tran- 
sient responses was also affected by the distortion, with 
earlier N1m and P2m latencies for the distorted sentences.
This finding deviates from our earlier results
using isolated speech sounds (~200 ms vowel sounds),
for which the response latencies remained unchanged 
when the stimuli were distorted. One reason for these 
differences may lie in the experimental design: in previ- 
ous studies by Miettinen et al. [17-19], short-duration 
isolated vowels were repeated at a fast rate whereas in 
the current case long-duration sentences with a com- 
plex, continually evolving spectral structure were pre- 
sented with intervening long silent periods. Similar 
latency results were recently reported by Obleser and 
Kotz [46], who found that the N1m response peaks earlier
and is larger in amplitude for distorted sentences
than for their undistorted counterparts. 

In the present experiment, the auditory cortex was 
highly responsive to distortion of speech, which is consist- 
ent with prior hemodynamic studies showing that the core 
auditory areas are sensitive to acoustic differences in 
speech stimuli [21-28]. The regions surrounding the audi- 
tory cortex, in turn, were sensitive to the intelligibility of 
speech, with stronger activation elicited by intelligible 
speech regardless of whether the stimulus material was 
distorted. These findings are congruent with the above 
fMRI results, in particular with those by Okada et al. [25], 
who observed a bilateral sensitivity of both the anterior 
and posterior superior temporal regions to speech intelligi- 
bility. Importantly, we observed that, already during the 
P2m time range, areas in the vicinity of the auditory cortex 
were sensitive to speech intelligibility as well (see Figure 8). 
This intelligibility effect, observable presumably because 
of the temporal resolution of the MEG, might reflect the 
influence of top-down feedback from higher-order cor- 
tical areas on the activity of auditory cortex. Similar find- 
ings have also been reported by Wild et al. [30] and 
Sohoglu et al. [47], who demonstrated that prior expecta- 
tions of speech content modulate the activity of auditory 
cortex during listening to distorted speech. 

The novel experimental paradigm introduced here 
points to several interesting possibilities for future 
research. Firstly, one should keep in mind that the
current intelligibility effects in cortical activity were 
observed in the passive condition which always followed 
the active condition, and it therefore remains to be clari- 
fied whether there were carry-over effects from one to 
the other. This interesting issue, related to the decay 
time of recognition memory, clearly deserves further 
study. Secondly, an important question for future inves- 
tigation is how the number of sentences used in the 
experiment affects intelligibility and behavioral perform- 
ance. Assuming the memory system probed with the 
current paradigm has a capacity limitation, increasing 
the number of sentences should at some point lead to 
decreased performance. Indeed, the intelligibility of the 
sentences in the current study might have been facili- 
tated by the limited number of words and sentence stubs 
used to construct the stimulus material. Thirdly, in 
studying the priming of memory representations of 
speech, a further step, requiring a larger set of sentences 
than in the current case, would be to average brain 
responses selectively based on the behavioral perform- 
ance (in terms of unintelligible vs. intelligible sentences), 
and to study how this is reflected in the activation of 
brain areas. We expect that this approach would lead to 
even more pronounced intelligibility effects in cortical 
activity than those reported here. 
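
The selective averaging proposed above could be sketched along the following lines. This is a hypothetical illustration of the idea and not an analysis performed in the current study: it assumes that each single-trial response (epoch) is stored as a list of samples of equal length and that a behavioral rating is available for every trial. The names `average_by_rating`, `epochs` and `ratings`, as well as the toy values, are assumptions.

```python
def average_by_rating(epochs, ratings):
    """Average single-trial epochs separately for each behavioral rating."""
    grouped = {}
    for epoch, rating in zip(epochs, ratings):
        grouped.setdefault(rating, []).append(epoch)
    averages = {}
    for rating, group in grouped.items():
        n = len(group)
        # Average across trials at each time sample.
        averages[rating] = [sum(samples) / n for samples in zip(*group)]
    return averages


# Toy example: three trials with three samples each (arbitrary units).
epochs = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0]]
ratings = ["intelligible", "intelligible", "unintelligible"]
averages = average_by_rating(epochs, ratings)
```

Because the number of trials per rating would differ between subjects and sessions, a larger sentence set would be needed to obtain comparable signal-to-noise ratios in the two averages, as noted above.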

Conclusions 

The current study utilized an experimental setting which 
allowed for physically identical, distorted speech stimuli 
to be perceived as either unintelligible or intelligible due 
to a single intervening exposure to the undistorted versions
of the stimuli. In the N1m time range (~100 ms),
the auditory areas within the superior temporal sulcus 
seem to be sensitive to acoustic degradation of speech. 
Thereafter, in the time range of P2m (200-300 ms), audi- 
tory cortex as well as cortical regions anterior and pos- 
terior to this area appear to be responsive to the 
intelligibility of speech. Following this transient brain ac- 
tivity, during the early SF (at 300-1000 ms), the region 
most sensitive to speech intelligibility was located in the 
posterior part of the superior temporal gyrus of each 
hemisphere. The current experimental setup could be seen
as providing a new methodological approach to studies of
auditory scene analysis [1]. This phenomenon has traditionally been
studied using sequencing of auditory stimuli alternating 
in, for example, frequency, intensity, or spatial location 
(see, e.g. [48]), whereas the experimental distorted- 
undistorted-distorted setup proposed here allows one to 
study how meaningful entities such as speech are segre- 
gated from a noisy signal by matching incoming acoustic 
signals to memory representations. Our results indicate 
that in building a coherent whole from various auditory 



confluences, the acoustic attributes of incoming auditory 
signals are identified rapidly in the auditory cortex 
within the first 100 ms. The activity of the auditory cor- 
tex appears to be modulated through feedback connec- 
tions, as indicated by the fact that already at 200 ms 
sensitivity to speech intelligibility emerges in this and 
surrounding areas. In making the unintelligible suddenly 
intelligible, these modulations can result in substantial 
changes in the way we perceive connected speech. 